Model selection in demographic time series using VC-bounds
Abstract
The problem of distinguishing density-independent (DI) from density-dependent (DD) demographic time series is important for understanding the mechanisms that regulate populations of animals and plants. We address this problem in a novel way by means of Statistical Learning Theory (SLT). SLT is built around the idea of VC-dimension, a complexity index for classes of parameterized functions. Though the VC-dimensions of nonlinear models are generally unknown, in the linear case the VC-dimension corresponds to the number of free parameters; this allows one to apply directly the model selection framework developed within SLT, known as Structural Risk Minimization (SRM). We generate noisy artificial time series, both DI and DD, and use SRM to recognize the model underlying the data, choosing among a suite of both density-dependent and density-independent demographies. We show that SRM significantly outperforms traditional model selection approaches, such as the Schwarz Information Criterion and Final Prediction Error, in recognizing both density-dependence and density-independence.
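As a rough illustration of the procedure the abstract describes, the sketch below simulates a noisy density-dependent series and picks among three linear-in-parameters candidates by minimizing a VC-penalized empirical risk. Everything specific here is an assumption rather than the authors' setup: the penalization factor is the practical regression bound popularized by Cherkassky and Mulier, the candidate suite (constant growth, Ricker, Gompertz) and the simulation parameters are hypothetical, and the VC-dimension is taken to be the number of fitted coefficients, as the abstract states for the linear case.

```python
import numpy as np

def vc_penalty(h, n):
    """Penalization factor 1 / (1 - sqrt(p - p ln p + ln n / (2n)))_+ with p = h/n.

    Assumed form (practical VC bound for regression); h is the VC-dimension,
    n the number of samples. Returns inf when the bound is vacuous.
    """
    p = h / n
    inner = 1.0 - np.sqrt(p - p * np.log(p) + np.log(n) / (2.0 * n))
    return np.inf if inner <= 0 else 1.0 / inner

def srm_select(candidates, y):
    """Fit each linear model by least squares and return the name with the
    smallest penalized risk, plus all scores. `candidates` maps a model name
    to its design matrix; h is taken as the number of columns."""
    n = len(y)
    scores = {}
    for name, X in candidates.items():
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        emp_risk = np.mean((y - X @ coef) ** 2)  # mean squared residual
        scores[name] = emp_risk * vc_penalty(X.shape[1], n)
    return min(scores, key=scores.get), scores

# Hypothetical DD data: Ricker dynamics N[t+1] = N[t] * exp(a - b*N[t] + noise).
rng = np.random.default_rng(0)
a, b, sigma, n_steps = 1.0, 0.01, 0.1, 101
N = np.empty(n_steps)
N[0] = 10.0
for t in range(n_steps - 1):
    N[t + 1] = N[t] * np.exp(a - b * N[t] + sigma * rng.normal())

r = np.log(N[1:] / N[:-1])  # per-step log growth rate, the regression target
x = N[:-1]
ones = np.ones_like(x)
candidates = {
    "DI": np.column_stack([ones]),                   # r = a
    "Ricker": np.column_stack([ones, x]),            # r = a + b*N
    "Gompertz": np.column_stack([ones, np.log(x)]),  # r = a + b*log(N)
}
best, scores = srm_select(candidates, r)
print(best, scores)
```

Under the same assumptions, the Final Prediction Error and Schwarz criteria can be compared by swapping `vc_penalty` for their own multiplicative factors, commonly written as (1 + p)/(1 - p) and 1 + (ln n / 2) · p/(1 - p) with p = h/n.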
Related articles
Motion Prediction Using VC-Generalization Bounds
This paper describes a novel application of Statistical Learning Theory (SLT) for motion prediction. SLT provides analytical VC-generalization bounds for model selection; these bounds relate unknown prediction risk (generalization performance) and known quantities such as the number of training samples, empirical error, and a measure of model complexity called the VC-dimension. We use the VC-ge...
VC-dimension and structural risk minimization for the analysis of nonlinear ecological models
The problem of distinguishing density-independent (DI) from density-dependent (DD) demographic time series is important for understanding the mechanisms that regulate populations of animals and plants. We address this problem in a novel way by means of Statistical Learning Theory. First, we estimate the VC-dimensions of the best known nonlinear ecological models through the methodology proposed...
High-dimensional classification by sparse logistic regression
We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic bounds for the resulting misclassification excess risk. The bounds can be reduced under the additional low-noise condition. The proposed complexity penalty ...
Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...
Fitting of Count Time Series Models on the Number of Patients Referred to Addiction Treatment Centers in Semnan County
Abstract. Count data over time are observed in many application areas. Many researchers use time series patterns to analyze these data. In this paper, Poisson and negative binomial linear count time series models with explanatory variables are studied on this type of data. The likelihood analysis and the evaluation of count time series models based on generalized linear models are pres...